35 research outputs found

    PatentSBERTa: A Deep NLP based Hybrid Model for Patent Distance and Classification using Augmented SBERT

    This study provides an efficient approach for using text data to calculate patent-to-patent (p2p) technological similarity, and presents a hybrid framework for leveraging the resulting p2p similarity for applications such as semantic search and automated patent classification. We create embeddings using Sentence-BERT (SBERT) based on patent claims. We leverage SBERT's efficiency in creating embedding distance measures to map p2p similarity in large sets of patent data. We deploy our framework for classification with a simple K-Nearest Neighbors (KNN) model that predicts the Cooperative Patent Classification (CPC) of a patent based on the class assignment of the K patents with the highest p2p similarity. We thereby validate that the p2p similarity captures the patents' technological features in terms of CPC overlap, and at the same time demonstrate the usefulness of this approach for automatic patent classification based on text data. Furthermore, the presented classification framework is simple, and the results are easy for end-users to interpret and evaluate. In the out-of-sample model validation, we are able to perform a multi-label prediction of all assigned CPC classes on the subclass (663) level on 1,492,294 patents with an accuracy of 54% and F1 score > 66%, which suggests that our model outperforms the current state-of-the-art in text-based multi-label and multi-class patent classification. We furthermore discuss the applicability of the presented framework for semantic IP search, patent landscaping, and technology intelligence. We finally point towards a future research agenda for leveraging multi-source patent embeddings, assessing their appropriateness across applications, and improving and validating patent embeddings by creating domain-expert curated Semantic Textual Similarity (STS) benchmark datasets.
    Comment: 18 pages, 7 figures and 4 tables
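    The classification step described in this abstract (predict a patent's CPC labels from the labels of its K most similar patents) can be illustrated with a minimal sketch. It is not the authors' implementation: the short vectors stand in for SBERT claim embeddings, the CPC subclass assignments are toy data, and the function names, the cosine similarity measure, and the `min_share` voting threshold are all assumptions made for illustration.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_cpc(query, corpus, k=3, min_share=0.5):
    """Multi-label KNN vote (hypothetical helper).

    corpus is a list of (embedding, set_of_cpc_labels) pairs. A label is
    predicted when at least `min_share` of the k nearest neighbours carry it.
    The threshold rule is an assumption, not taken from the paper.
    """
    ranked = sorted(corpus, key=lambda item: cosine(query, item[0]), reverse=True)
    neighbours = ranked[:k]
    votes = Counter(label for _, labels in neighbours for label in labels)
    return {label for label, n in votes.items() if n / len(neighbours) >= min_share}


# Toy stand-ins for SBERT embeddings of patent claims, with illustrative labels.
corpus = [
    ([0.9, 0.1, 0.0], {"G06F", "G06N"}),
    ([0.8, 0.2, 0.1], {"G06N"}),
    ([0.1, 0.9, 0.0], {"A61K"}),
    ([0.0, 0.8, 0.3], {"A61K", "C07D"}),
    ([0.7, 0.0, 0.3], {"G06F"}),
]

query = [0.85, 0.1, 0.1]
predicted = predict_cpc(query, corpus, k=3)
print(sorted(predicted))
```

    In a real pipeline the toy vectors would be replaced by embeddings of patent claims, and an exhaustive sort like this one would likely give way to an approximate nearest-neighbour index at the scale the abstract reports.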

    The Privatization of AI Research(-ers): Causes and Potential Consequences -- From university-industry interaction to public research brain-drain?

    The private sector is playing an increasingly important role in basic Artificial Intelligence (AI) R&D. This phenomenon, which is reflected in the perception of a brain drain of researchers from academia to industry, is raising concerns about a privatisation of AI research which could constrain its societal benefits. We contribute to the evidence base by quantifying transition flows between industry and academia and studying their drivers and potential consequences. We find a growing net flow of researchers from academia to industry, particularly from elite institutions into technology companies such as Google, Microsoft and Facebook. Our survival regression analysis reveals that researchers working in the field of deep learning as well as those with higher average impact are more likely to transition into industry. A difference-in-differences analysis of the effect of switching into industry on a researcher's influence, proxied by citations, indicates that an initial increase in impact declines as researchers spend more time in industry. This points to a privatisation of AI knowledge compared to a counterfactual where those high-impact researchers had remained in academia. Our findings highlight the importance of strengthening the public AI research sphere in order to ensure that the future of this powerful technology is not dominated by private interests.